Zhewen Tan

ESPO: Early-Stopping Proximal Policy Optimization

May 28, 2026
Via arXiv

Harness-Bench: Measuring Harness Effects across Models in Realistic Agent Workflows

May 27, 2026
Via arXiv

MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection

May 22, 2026
Via arXiv

TriPlay-RL: Tri-Role Self-Play Reinforcement Learning for LLM Safety Alignment

Jan 26, 2026
Via arXiv

ARC: Active and Reflection-driven Context Management for Long-Horizon Information Seeking Agents

Jan 17, 2026
Via arXiv

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Dec 14, 2025
Via arXiv